[Python] Best strategy for dealing with incomplete lines of data from a file.
Posted
by adoran
on Stack Overflow
Published on 2010-06-16T14:12:13Z
Indexed on
2010/06/16
14:22 UTC
I use the following block of code to read lines out of a file 'f' into a nested list:
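    t = []
    strmat = []
    for data in f:
        # Strip only the newline; a bare rstrip() would also drop trailing empty tab fields.
        clean_data = data.rstrip('\n')
        data = clean_data.split('\t')
        t.append(data[0])
        strmat.append(data[1:])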
Sometimes, however, the data is incomplete and a row may look like this:
['955.159', '62.8168', '', '', '', '', '', '', '', '', '', '', '', '', '', '29', '30', '0', '0']
This puts a spanner in the works: I would like the list to be converted to floats automatically when I turn it into an array, but the empty fields '' cause it to be stored as an array of strings instead (dtype: S12).
I could add an 'if' statement that converts every empty field to NULL (0 would be wrong in this instance), but I am unsure whether that is the best approach.
- Is this the best strategy for dealing with incomplete data?
- Should I fix the data while reading the stream, or do it post hoc?
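One way to handle the empty fields while reading is to convert them to NaN as each line is parsed, so that every row is already numeric. The following is a minimal sketch of that idea, not a definitive answer. The function name `parse_rows`, the `missing` parameter, and the sample data are assumptions made for illustration, and unlike the code in the question it also converts the first column to float.

```python
import io
import math


def parse_rows(lines, missing=float("nan")):
    """Split tab-separated lines into a first-column list and a float matrix.

    Empty fields become `missing` (NaN by default) instead of ''.
    """
    t = []
    strmat = []
    for line in lines:
        # Strip only the line ending so trailing empty fields are preserved.
        fields = line.rstrip("\r\n").split("\t")
        if fields == [""]:
            continue  # skip completely blank lines
        t.append(float(fields[0]) if fields[0].strip() else missing)
        strmat.append([float(x) if x.strip() else missing for x in fields[1:]])
    return t, strmat


# Hypothetical sample standing in for the file object 'f' from the question.
sample = io.StringIO(
    "955.159\t62.8168\t\t\t29\t30\t0\t0\n"
    "1.5\t2.5\t3.5\t4.5\t5\t6\t7\t8\n"
)
t, strmat = parse_rows(sample)
```

Because every element is now a float, a list built this way should no longer be treated as strings when it is turned into an array, and NaN keeps the missing values distinguishable from a genuine 0. Cleaning post hoc is also possible, but it means walking the data a second time.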
© Stack Overflow or respective owner